Relative Compositionality of Multi-word Expressions: A Study of Verb-Noun (V-N) Collocations
Abstract
Recognizing Multi-word Expressions (MWEs) and measuring their relative compositionality are crucial to Natural Language Processing. Various statistical techniques have been proposed to recognize MWEs. In this paper, we integrate all the existing statistical features and investigate a range of classifiers for their suitability for recognizing non-compositional Verb-Noun (V-N) collocations. In the task of ranking V-N collocations by their relative compositionality, we show that the ranks computed by the classifier correlate significantly better with the human ranking than the ranks produced by any individual feature do. We also show that the properties ‘Distributed frequency of object’ (as defined in [27]) and ‘Nearest Mutual Information’ (as adapted from [18]) contribute greatly to the recognition of non-compositional MWEs of the V-N type and to the ranking of V-N collocations by their relative compositionality.
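The ranking comparison described in the abstract can be illustrated with a rank correlation. The sketch below is only an illustration under stated assumptions: the abstract does not name the correlation measure, so Spearman's rank correlation is assumed here, and the scores, variable names, and helper functions are hypothetical toy values rather than data or code from the paper.

```python
from math import sqrt


def average_ranks(scores):
    """Rank scores from highest (rank 1) to lowest, averaging ranks over ties."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        # Extend j to cover the run of items tied with the item at position i.
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman's rho: Pearson correlation computed on the two rank vectors."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


# Hypothetical toy scores for six V-N collocations (not from the paper).
human = [6, 5, 4, 3, 2, 1]                      # assumed human judgments
classifier = [0.9, 0.8, 0.5, 0.6, 0.2, 0.1]     # assumed classifier output
single_feature = [0.3, 0.9, 0.1, 0.8, 0.2, 0.7] # assumed single-feature score

rho_classifier = spearman(human, classifier)
rho_feature = spearman(human, single_feature)
print(f"classifier vs. human:     {rho_classifier:.3f}")
print(f"single feature vs. human: {rho_feature:.3f}")
```

On this toy input the classifier ranking agrees with the human ranking more closely than the single-feature ranking does, which is the kind of comparison the abstract reports; the actual measure and values are those given in the paper itself.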
Similar resources
Measuring the Relative Compositionality of Verb-Noun (V-N) Collocations by Integrating Features
Measuring the relative compositionality of Multi-word Expressions (MWEs) is crucial to Natural Language Processing. Various collocation based measures have been proposed to compute the relative compositionality of MWEs. In this paper, we define novel measures (both collocation based and context based measures) to measure the relative compositionality of MWEs of V-N type. We show that the correl...
Lexical and Grammatical Collocations in Writing Production of EFL Learners
Lewis (1993) recognized the significance of word combinations, including collocations, by presenting the lexical approach. Because of the crucial role of collocation in vocabulary acquisition, this research set out to evaluate the rate of collocations in Iranian EFL learners' writing production across L1 and L2. In addition, L1 interference with L2 collocational use in the learners' writing samples was st...
Using Distributional Similarity of Multi-way Translations to Predict Multiword Expression Compositionality
We predict the compositionality of multiword expressions using distributional similarity between each component word and the overall expression, based on translations into multiple languages. We evaluate the method over English noun compounds, English verb particle constructions and German noun compounds. We show that the estimation of compositionality is improved when using translations into m...
Shared Task System Description: Measuring the Compositionality of Bigrams using Statistical Methodologies
Measuring the relative compositionality of bigrams is crucial for identifying Multi-word Expressions (MWEs) in Natural Language Processing (NLP) tasks. The article presents the experiments carried out as part of the participation in the shared task ‘Distributional Semantics and Compositionality (DiSCo)’ organized as part of the DiSCo workshop at ACL-HLT 2011. The experiments deal with various c...
Combining Different Features of Idiomaticity for the Automatic Classification of Noun+Verb Expressions in Basque
We present an experimental study of how different features help measure the idiomaticity of noun+verb (NV) expressions in Basque. After testing several techniques for quantifying the four basic properties of multiword expressions or MWEs (institutionalization, semantic non-compositionality, morphosyntactic fixedness and lexical fixedness), we test different combinations of them for classifica...